We collected data from a total of N = 389 participants using a Prolific representative sample that was stratified on the basis of age, sex, ethnicity, and political affiliation. Following our exclusion criteria, we excluded 28 participants who inferred the wrong category in five or more non-target inference items and one participant who rated their own data to be unfit for analysis. This resulted in N = 360 valid datasets (175 men, 180 women, 4 non-binary, 1 n/a; median age Mdn = 46 years, ranging from 19 to 79). Of these participants, 42% identified as Democrats, 23% as Independents, 33% as Republicans, and 2% as other. According to Prolific demographics, more participants identified as Independents (31% Democrats, 42% Independents, 27% Republicans), possibly because there was no “Independent” option in our survey. Participants received monetary compensation of 1.20 GBP for completing the 8-minute study. Participants’ mean political ideology was M = 54.6 (SD = 31.6), on a scale from 0 = conservative to 100 = liberal. An additional 80 people started the experiment on Prolific but returned their submission, timed out, did not meet the inclusion criteria, or failed the comprehension check.
We did not preregister exclusion of statistical outliers. However, upon inspection of the data we noticed that there were many data points that were very unlikely to reflect true inferences. That is, some participants in both the stereotype-disconfirming and the stereotype-confirming condition seemed to infer categories from opinions that are stereotypically associated with the opposing category (e.g., inferring from a pro-gun control opinion that the person must be a Republican). We assume that such responses are likely due to the participants understanding the items the wrong way around or responding at random. To investigate the potential effect of these outliers we conducted additional analyses excluding them. Theoretically, any values below the scale midpoint (50) are very unlikely to reflect genuine inferences, because they would indicate that the category that is not stereotypically associated with the opinion is more likely to express the opinion than the category that is stereotypically associated with it. While it would therefore seem appropriate to exclude any values below 50, we decided to exclude values below 40, to account for the fact that participants may respond somewhat inaccurately using the sliding scale. For the main analyses we report tests with and without this exclusion criterion.
A similar problem occurred for the stereotype measure, on which some participants indicated that category members would on average strongly disagree with opinions that are stereotypical for their category. However, for this measure there is less of a clear cutoff, because even an opinion that is very stereotypical or diagnostic for a category need not be shared by a majority of its members (as long as it is shared by considerably fewer members of the opposing category). To nevertheless investigate the effect of these outliers we conducted a multiverse analysis using several different cutoff criteria.
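The point that a diagnostic opinion need not be held by a majority can be made concrete with Bayes' rule. The following is a minimal sketch with hypothetical endorsement rates (30% in the stereotyped category, 5% in the opposing category) and an assumed equal base rate; none of these numbers come from the study.

```r
# Hypothetical endorsement rates (illustrative values, not study data)
p_opinion_given_a <- 0.30  # share of category A members who hold the opinion
p_opinion_given_b <- 0.05  # share of category B members who hold the opinion
prior_a <- 0.50            # assumed equal base rates of A and B

# Bayes' rule: probability that a person holding the opinion belongs to A
posterior_a <- (p_opinion_given_a * prior_a) /
  (p_opinion_given_a * prior_a + p_opinion_given_b * (1 - prior_a))

round(posterior_a, 3)  # 0.857
```

Under these assumptions the opinion is held by only a minority of category A, yet it still points strongly towards A, which is why a low stereotype rating is not necessarily an invalid response.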
# Stereotype data
stereo_dat <- dat %>%
filter(item_type == "target") %>%
filter(diagnosticity_component == "COUNTERPROB")
stereo_dat %>%
ggplot(., aes(typicality, stereo, color = category_type)) +
geom_boxplot(outlier.shape = NA) +
geom_point(position = position_jitterdodge()) +
labs(x = "Typicality", y = "Stereotype", color = "Category Type") +
scale_color_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk()
# Loop over all fixed outlier cutoffs
mv_stereo <- data.frame(
cutoff = c("none", "fixed 1", "fixed 10", "fixed 20", "fixed 30", "fixed 40"),
cutoff_value = c(0, 1, 10, 20, 30, 40))
for (c in seq_len(nrow(mv_stereo))) {
mv_dat <- stereo_dat %>%
filter(stereo >= mv_stereo$cutoff_value[c])
# Run ANOVA
mv_mod <- mv_dat %>%
anova_test(dv = stereo,
effect.size = "pes",
between = c(category_type, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
# Save test statistics for h1
mv_stereo$`n excluded`[c] <-
nrow(stereo_dat) - nrow(mv_dat)
mv_stereo$ct_DFn[c] <- mv_mod$DFn[1]
mv_stereo$ct_DFd[c] <- mv_mod$DFd[1]
mv_stereo$ct_F[c] <- mv_mod$F[1]
mv_stereo$ct_p[c] <- mv_mod$p[1]
mv_stereo$ct_pes[c] <- mv_mod$pes[1]
mv_stereo$typ_DFn[c] <- mv_mod$DFn[2]
mv_stereo$typ_DFd[c] <- mv_mod$DFd[2]
mv_stereo$typ_F[c] <- mv_mod$F[2]
mv_stereo$typ_p[c] <- mv_mod$p[2]
mv_stereo$typ_pes[c] <- mv_mod$pes[2]
mv_stereo$int_DFn[c] <- mv_mod$DFn[3]
mv_stereo$int_DFd[c] <- mv_mod$DFd[3]
mv_stereo$int_F[c] <- mv_mod$F[3]
mv_stereo$int_p[c] <- mv_mod$p[3]
mv_stereo$int_pes[c] <- mv_mod$pes[3]
}
# Print multiverse table
knitr::kable(select(mv_stereo, - cutoff_value),
format = "markdown")
| cutoff | n excluded | ct_DFn | ct_DFd | ct_F | ct_p | ct_pes | typ_DFn | typ_DFd | typ_F | typ_p | typ_pes | int_DFn | int_DFd | int_F | int_p | int_pes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| none | 0 | 1 | 356 | 0.01 | .918 | .000 | 1 | 356 | 3.32 | .069 | .009 | 1 | 356 | 0.58 | .447 | .002 |
| fixed 1 | 13 | 1 | 343 | 0.06 | .800 | .000 | 1 | 343 | 7.62 | .006 | .022 | 1 | 343 | 2.41 | .121 | .007 |
| fixed 10 | 19 | 1 | 337 | 0.00 | .996 | .000 | 1 | 337 | 7.01 | .009 | .020 | 1 | 337 | 1.89 | .171 | .006 |
| fixed 20 | 33 | 1 | 323 | 0.13 | .718 | .000 | 1 | 323 | 10.81 | .001 | .032 | 1 | 323 | 0.03 | .871 | .000 |
| fixed 30 | 59 | 1 | 297 | 1.43 | .233 | .005 | 1 | 297 | 6.71 | .010 | .022 | 1 | 297 | 0.32 | .570 | .001 |
| fixed 40 | 70 | 1 | 286 | 0.29 | .591 | .001 | 1 | 286 | 5.85 | .016 | .020 | 1 | 286 | 0.52 | .471 | .002 |
To test whether the manipulation was successful, we conducted a multiverse analysis with multiple fixed outlier cutoffs. We excluded participants with stereotype ratings below 1, 10, 20, 30, or 40 and, for each cutoff, conducted a two-way between-subjects ANOVA with the stereotype measure as the dependent variable and typicality and category type as between-subjects factors. The effect of typicality was marginally significant without outlier exclusion, F(1, 356) = 3.32, p = .069, and significant at all five outlier thresholds (all p ≤ .016). The effect of category type and the two-way interaction were not significant in any of the tests. These results provide moderate support for the effectiveness of our manipulation.
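The cutoff loop described above can also be written as a small function applied to each cutoff. The sketch below is illustrative only: it runs on simulated toy data, the names `toy`, `run_cutoff`, and `mv_toy` are hypothetical, and it uses base R `aov()`, which computes sequential (Type I) sums of squares and may therefore differ from the ANOVA routine used in the analysis once the cells become unbalanced.

```r
# Simulated toy data for a 2 x 2 between-subjects design (not study data)
set.seed(1)
toy <- data.frame(
  category_type = factor(rep(c("PARTY", "IDEOLOGICAL"), each = 40)),
  typicality    = factor(rep(rep(c("DIS", "CON"), each = 20), times = 2)),
  stereo        = pmin(pmax(round(rnorm(80, mean = 60, sd = 25)), 0), 100)
)

cutoffs <- c(0, 1, 10, 20, 30, 40)

# Apply one fixed cutoff, refit the ANOVA, and keep the typicality effect
run_cutoff <- function(cutoff, data) {
  kept <- data[data$stereo >= cutoff, ]
  fit  <- summary(aov(stereo ~ category_type * typicality, data = kept))[[1]]
  data.frame(
    cutoff     = cutoff,
    n_excluded = nrow(data) - nrow(kept),
    typ_F      = fit[["F value"]][2],  # row 2 = typicality
    typ_p      = fit[["Pr(>F)"]][2]
  )
}

# One row per cutoff
mv_toy <- do.call(rbind, lapply(cutoffs, run_cutoff, data = toy))
mv_toy
```

Returning one data frame row per cutoff avoids growing columns inside a loop and keeps each specification's result in a single place.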
stereo_dat %>%
ggplot(., aes(sample = stereo)) +
labs(x = "Theoretical quantiles", y = "Data quantiles") +
stat_qq(color = "#000000") +
stat_qq_line(color = "#000000") +
facet_grid(category_type ~ typicality, labeller = "label_value") +
theme_cs_talk()
# PREREGISTERED
# Inference data
inf_dat <- dat %>%
filter(diagnosticity_component == "COUNTERPROB") %>%
filter(category_type == "PARTY") %>%
filter(item_type == "target")
# Descriptives
inf_desc <- inf_dat %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_t <- inf_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_d <- inf_dat %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_report <- paste0("t(", inf_t$df, ") = ",
inf_t$statistic, ", ", inf_t$p, ", d = ",
inf_d)
# WITH OUTLIER EXCLUSION
# Exclude outliers
inf_dat_ex <- inf_dat %>%
filter(inf >= 40)
# Descriptives
inf_ex_desc <- inf_dat_ex %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_ex_t <- inf_dat_ex %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_ex_d <- inf_dat_ex %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_ex_report <- paste0("t(", inf_ex_t$df, ") = ",
inf_ex_t$statistic, ", ", inf_ex_t$p, ", d = ",
inf_ex_d)
# WILCOXON
# One-sided Wilcoxon rank-sum test
inf_wcx <- inf_dat %>%
wilcox_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS")
# Wilcoxon effect size r
inf_wcx_r <- inf_dat %>%
wilcox_effsize(inf ~ typicality,
ref.group = "DIS") %>%
pull(effsize)
# Print wilcoxon test
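# wilcox_effsize() defines r = |Z| / sqrt(N), with N the total sample size
# for independent groups, so |Z| is recovered as r * sqrt(N)
inf_wcx_report <- paste0("Z = ", forma(inf_wcx_r * sqrt(nrow(inf_dat))),
", ", formp(inf_wcx$p, TRUE), ", r = ", forma(inf_wcx_r))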
To test our hypothesis in the party category condition, we conducted a preregistered independent samples t-test comparing target inferences of partisanship in the stereotype-confirming and stereotype-disconfirming conditions. The target inferences in the disconfirming condition (M = 77.6, SD = 19.6) were not significantly lower than in the stereotype-confirming condition (M = 81.4, SD = 20.9), t(177.22) = -1.27, p = .104, d = -0.19. To investigate whether this was due to statistical outliers, we conducted another independent samples t-test excluding participants with values below 40. After exclusion, the inferences in the disconfirming condition (M = 80.2, SD = 15.6) were significantly lower than in the confirming condition (M = 85.3, SD = 13.6), t(166.51) = -2.27, p = .012, d = -0.35.
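As a plausibility check, the test statistics without outlier exclusion can be recomputed from the reported group descriptives alone. The sketch below assumes a Welch (unequal variances) t-test and a Cohen's d standardized by the root mean of the two group variances; `welch_from_summary` is a hypothetical helper, not part of the analysis script.

```r
# Welch t-test and Cohen's d from summary statistics
# (group 1 = disconfirming, group 2 = confirming)
welch_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  v1 <- s1^2 / n1
  v2 <- s2^2 / n2
  t_val <- (m1 - m2) / sqrt(v1 + v2)
  # Welch-Satterthwaite degrees of freedom
  df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))
  # Standardized mean difference using the root mean of the two variances
  d <- (m1 - m2) / sqrt((s1^2 + s2^2) / 2)
  list(t = t_val, df = df, p_less = pt(t_val, df), d = d)
}

# Means, SDs, and group sizes taken from the descriptives of the inference measure
res <- welch_from_summary(77.600, 19.569, 90, 81.422, 20.913, 90)
round(unlist(res), 3)
```

With these inputs the sketch reproduces t(177.22) = -1.27 and d = -0.19, and its one-sided p-value rounds to the reported value.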
inf_dat %>%
ggplot(., aes(typicality, inf)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(width = 0.2) +
labs(x = "Typicality", y = "Inferences") +
theme_cs_talk()
# Loop over all fixed and distribution-based outlier cutoffs
mv <- data.frame(
cutoff = c("none", "fixed 20", "fixed 30", "fixed 40",
"Mdn ± 2.5 IQR", "Mdn ± 2.0 IQR", "Mdn ± 1.5 IQR", "Mdn ± 1.0 IQR"),
cutoff_type = c("fixed", "fixed", "fixed", "fixed",
"Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR"),
cutoff_value = c(0, 20, 30, 40, 2.5, 2.0, 1.5, 1.0))
for (c in seq_len(nrow(mv))) {
if (mv$cutoff_type[c] == "fixed") {
mv_dat <- inf_dat %>%
filter(inf >= mv$cutoff_value[c])
} else if (mv$cutoff_type[c] == "Mdn ± IQR") {
mv_dat <- inf_dat %>%
group_by(typicality) %>%
filter(!is_outlier(inf, coef = mv$cutoff_value[c])) %>%
ungroup()
}
# Independent samples t-test
mv_t <- mv_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = FALSE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
mv_d <- mv_dat %>%
cohens_d(inf ~ typicality) %>%
pull(effsize) %>%
forma()
# Save test statistics for h1
mv$`n excluded`[c] <- nrow(inf_dat) - nrow(mv_dat)
mv$df[c] <- mv_t$df
mv$t[c] <- mv_t$statistic
mv$p[c] <- mv_t$p
mv$d[c] <- mv_d
}
# Print multiverse table
knitr::kable(select(mv, -cutoff_type, - cutoff_value), format = "markdown")
| cutoff | n excluded | df | t | p | d |
|---|---|---|---|---|---|
| none | 0 | 177.22 | -1.27 | .104 | -0.19 |
| fixed 20 | 5 | 172.69 | -1.91 | .029 | -0.29 |
| fixed 30 | 7 | 169.77 | -2.10 | .018 | -0.32 |
| fixed 40 | 9 | 166.51 | -2.27 | .012 | -0.35 |
| Mdn ± 2.5 IQR | 2 | 174.93 | -1.98 | .025 | -0.30 |
| Mdn ± 2.0 IQR | 6 | 169.41 | -2.30 | .011 | -0.35 |
| Mdn ± 1.5 IQR | 8 | 166.04 | -2.45 | .008 | -0.37 |
| Mdn ± 1.0 IQR | 13 | 153.30 | -3.32 | <.001 | -0.51 |
To investigate the robustness of the effect across various outlier exclusion criteria, we conducted a multiverse analysis. We performed independent samples t-tests across several fixed and distribution-based thresholds. For the fixed thresholds we excluded ratings below 20, 30, or 40 uniformly across both experimental conditions. For the distribution-based criteria we excluded ratings that were 1.0, 1.5, 2.0, or 2.5 times the IQR above or below the condition median. The effect was significant across all seven thresholds. These results suggest that the effect may have been obscured due to model outliers. However, removing outliers may also have introduced bias, for instance if we inadvertently removed more true model outliers in one of the two experimental conditions. This risk is especially high with the distribution-based thresholds we used: If in the disconfirming condition the distribution of inferences is shifted towards the scale midpoint, this would make low outliers more difficult to detect (due to a floor effect). Thus, applying the thresholds may have artificially inflated the effect by predominantly removing low outliers in the confirming condition. As this risk is reduced by the fixed thresholds, these provide a more conservative estimate of the true effect.
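The concern about condition-wise, distribution-based thresholds can be illustrated with a toy example. The sketch below assumes quartile-based fences (Q1 - coef × IQR and Q3 + coef × IQR); the exact bounds used by `is_outlier` in the analysis may be defined differently, and both vectors are made-up values.

```r
# Flag values outside quartile-based fences (assumed rule, see lead-in)
flag_outliers <- function(x, coef = 1.5) {
  q   <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - coef * iqr | x > q[2] + coef * iqr
}

# Made-up ratings: both conditions contain the same implausibly low value (30)
con <- c(30, 70, 80, 85, 90, 95, 100, 100)  # clustered near the scale maximum
dis <- c(30, 50, 60, 65, 70, 80, 90, 95)    # shifted towards the midpoint

con[flag_outliers(con)]  # 30
dis[flag_outliers(dis)]  # numeric(0)
```

The identical rating of 30 is removed in the confirming condition but retained in the disconfirming condition, because the lower fence sits much lower when the distribution is more spread out towards the midpoint. A fixed cutoff treats both conditions alike.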
Although these results suggest that there may be an effect of typicality on inferences of partisanship, this effect seems to be considerably smaller and less robust compared to the previous study. This may be because we removed the pre-measurement and changed the cover story. As we have argued above, the repeated-measures design made it very easy to infer our research hypothesis. Part of the observed effect in the previous study may therefore have been the consequence of demand effects. Another possibility is that measuring the inferences before presenting the counterstereotypical exemplar led to an activation of the relevant knowledge structures and therefore facilitated the learning of the new information. However, as neither demand characteristics nor pre-activation of the relevant knowledge structures will be present when we come across a counterstereotype in real life, the effects of the current study may provide a more realistic estimate of the change in inferences that is induced by counterstereotypes.
inf_dat_ex %>%
ggplot(., aes(sample = inf)) +
labs(x = "Theoretical quantiles", y = "Data quantiles") +
stat_qq(color = "#000000") +
stat_qq_line(color = "#000000") +
facet_grid(~ typicality, labeller = "label_value") +
theme_cs_talk()
knitr::kable(inf_desc, format = "markdown")
| typicality | variable | n | mean | sd | ci95_low | ci95_upp |
|---|---|---|---|---|---|---|
| DIS | inf | 90 | 77.600 | 19.569 | 73.55700 | 81.64300 |
| CON | inf | 90 | 81.422 | 20.913 | 77.10133 | 85.74267 |
inf_dat_ex %>%
mutate(inf_bin = cut(inf,
breaks = 9, labels = FALSE, include.lowest = TRUE)) %>%
ggplot() +
geom_bar(position = position_dodge(preserve = "single"),
aes(x = inf_bin, y = after_stat(prop), fill = typicality),
width = 0.8) +
labs(title = "Histogram of Inference Scores",
x = "Inference Scores",
y = "Proportion", fill = "Typicality") +
scale_fill_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk() +
theme(axis.text.x = element_blank())
# Prepare data
inf_par_dat <- inf_dat_ex %>%
# Select participants who identify as Republican, Democrat, or Independent
filter(partisan_identity %in% c("Republican", "Democrat", "Independent"))
# Descriptives
inf_par_desc <- inf_par_dat %>%
group_by(partisan_identity, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
inf_par_mod <- inf_par_dat %>%
anova_test(dv = inf,
effect.size = "pes",
between = c(partisan_identity, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
knitr::kable(inf_par_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| partisan_identity | 2 | 161 | 1.66 | .193 | | .020 |
| typicality | 1 | 161 | 6.13 | .014 | * | .037 |
| partisan_identity:typicality | 2 | 161 | 0.49 | .616 | | .006 |
It is possible that the effect of diagnosticity on the inferences depends on the participant’s partisanship. To test this, we conducted a two-way between-subjects ANOVA on the data after excluding inferences below 40, with the target inference as the dependent variable and the participants’ party identification (Democrat, Republican, or Independent) and typicality as between-subjects factors. We found a significant effect of typicality, F(1, 161) = 6.13, p = .014, ηp² = .037. All other effects were non-significant.
# Prepare data
inf_par_fit_dat <- inf_dat_ex %>%
# Select participants who identify with one of the major parties
filter(partisan_identity %in% c("Republican", "Democrat")) %>%
# Recode partisan identity
mutate(partisan_identity = case_match(partisan_identity,
"Democrat" ~ "DEM", "Republican" ~ "REP")) %>%
mutate(partisan_fit = as.factor(ifelse(
partisan_identity == target_category_label,
"same", "different")))
# Descriptives
inf_par_fit_desc <- inf_par_fit_dat %>%
group_by(partisan_fit, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
inf_par_fit_mod <- inf_par_fit_dat %>%
anova_test(dv = inf,
effect.size = "pes",
between = c(partisan_fit, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
knitr::kable(inf_par_fit_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| partisan_fit | 1 | 122 | 0.00 | .976 | | .000 |
| typicality | 1 | 122 | 6.32 | .013 | * | .049 |
| partisan_fit:typicality | 1 | 122 | 0.42 | .520 | | .003 |
It is possible that the effect of diagnosticity on the inferences depends on the fit between the participant’s party identification and the target’s partisanship. We therefore ran an exploratory analysis examining the role of ideological fit (i.e., whether participant and target belong to the same category or a different one). Using the data after excluding inferences below 40, we retained only participants who identified as Democrats or Republicans and conducted a two-way between-subjects ANOVA with the target inference as the dependent variable and ideological fit and typicality as between-subjects factors. We found a significant effect of typicality, F(1, 122) = 6.32, p = .013, ηp² = .049. All other effects were non-significant.
# Descriptives
inf_issue_desc <- inf_dat_ex %>%
group_by(issue, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
inf_issue_mod <- inf_dat_ex %>%
anova_test(dv = inf,
effect.size = "pes",
between = c(issue, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
knitr::kable(inf_issue_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| issue | 2 | 165 | 2.54 | .082 | | .030 |
| typicality | 1 | 165 | 5.53 | .020 | * | .032 |
| issue:typicality | 2 | 165 | 0.67 | .513 | | .008 |
We ran another exploratory analysis to test whether the effect of diagnosticity on the inferences was moderated by the issue for which diagnosticity was manipulated. Using the data after excluding inferences below 40, we conducted a two-way between-subjects ANOVA with the target inference as the dependent variable and issue and typicality as between-subjects factors. We found a significant effect of typicality, F(1, 165) = 5.53, p = .020, ηp² = .032. All other effects were non-significant.
inf_typ_dat <- inf_dat_ex %>%
group_by(typicality) %>%
mutate(typicality_bin = ntile(typicality_target, 2)) %>%
mutate(typicality_bin = factor(typicality_bin, labels = c("low", "high"))) %>%
ungroup() %>%
mutate(typicality_target_c = typicality_target - mean(typicality_target))
# DV: Typicality
# Descriptives
typ_desc <- inf_typ_dat %>%
group_by(typicality) %>%
get_summary_stats(typicality_target, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
typ_t <- inf_typ_dat %>%
t_test(
typicality_target ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
typ_d <- inf_typ_dat %>%
cohens_d(typicality_target ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
typ_report <- paste0("t(", typ_t$df, ") = ",
typ_t$statistic, ", ", typ_t$p, ", d = ",
typ_d)
# DV: Inferences
inf_typ_desc <- inf_typ_dat %>%
group_by(typicality_bin, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run regression
contrasts(inf_typ_dat$typicality) <- c(1, 0)
inf_typ_mod <- lm(
data = inf_typ_dat,
inf ~ typicality * typicality_target_c)
inf_typ_mod_coefs <- summary(inf_typ_mod)$coefficients %>%
as_tibble() %>%
rowwise() %>%
transmute(
predictor = "var", estimate = forma(`Estimate`, 2),
SE = forma(`Std. Error`, 2), t = forma(`t value`, 2),
p = formp(`Pr(>|t|)`)) %>%
ungroup() %>%
mutate(predictor = c("intercept", "typicality CON -> DIS",
"typicality rating LOW -> HIGH", "interaction"))
knitr::kable(inf_typ_mod_coefs, format = "markdown")
| predictor | estimate | SE | t | p |
|---|---|---|---|---|
| intercept | 81.17 | 2.04 | 39.72 | <.001 |
| typicality CON -> DIS | -1.86 | 2.71 | -0.68 | .495 |
| typicality rating LOW -> HIGH | 0.29 | 0.10 | 3.09 | .002 |
| interaction | -0.36 | 0.12 | -3.11 | .002 |
# Plot
inf_typ_dat %>%
ggplot(aes(x = typicality_target, y = inf, color = typicality)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = TRUE,
aes(group = typicality)) +
labs(
title = "Moderation by typicality rating",
x = "Typicality rating",
y = "Inference",
color = "Typicality") +
scale_color_manual(values = c("#2d85ff", "#28ea96")) +
theme_cs_talk()
The target person’s perceived typicality in the disconfirming condition (M = 53.6, SD = 23.9) was significantly lower than in the confirming condition (M = 81.4, SD = 16.4), t(150.95) = -8.88, p < .001, d = -1.36. To explore the effect of the typicality ratings, we ran a linear regression model with the inferences as the dependent variable and typicality, the typicality rating, and their interaction as the predictors. We used treatment coding for typicality (confirming condition as the reference category) and centered the typicality ratings. The overall model was significant, F(3, 167) = 5.31, p = .002, R² = .087. The main effect of typicality was not significant, b = -1.86, SE = 2.71, t(167) = -0.68, p = .495. The main effect of perceived typicality was significant, b = 0.29, SE = 0.10, t(167) = 3.09, p = .002, indicating that in the stereotype-confirming condition higher perceived typicality was associated with stronger inferences. Importantly, the interaction effect was significant, b = -0.36, SE = 0.12, t(167) = -3.11, p = .002. This interaction indicates that the positive association between perceived typicality and inferences found in the confirming condition was absent in the disconfirming condition, where the estimated slope was close to zero (0.29 - 0.36 = -0.07).
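The interaction is easier to read as two condition-specific slopes. The sketch below derives them from the rounded coefficients in the regression table; it assumes the coding described above (confirming condition = 0, disconfirming condition = 1, ratings centered), and `predict_inf` is a hypothetical helper.

```r
# Rounded coefficients from the regression table
b_intercept   <- 81.17
b_dis         <- -1.86  # confirming -> disconfirming
b_rating      <- 0.29   # typicality rating (centered)
b_interaction <- -0.36

# Simple slopes of the typicality rating within each condition
slope_con <- b_rating
slope_dis <- b_rating + b_interaction

# Predicted inference for a condition (dis = 0 or 1) and a centered rating
predict_inf <- function(dis, rating_c) {
  b_intercept + b_dis * dis + b_rating * rating_c +
    b_interaction * dis * rating_c
}

c(slope_con = slope_con, slope_dis = slope_dis)
predict_inf(dis = c(0, 1), rating_c = 20)  # 20 points above the mean rating
```

Because the slopes are computed from rounded estimates, they are approximate; standard errors for the disconfirming slope would require the coefficient covariance matrix of the fitted model.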
# Inference data
inf_stereoprob_dat <- dat %>%
filter(diagnosticity_component == "STEREOPROB") %>%
filter(category_type == "PARTY") %>%
filter(item_type == "target")
# Descriptives
inf_stereoprob_desc <- inf_stereoprob_dat %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_stereoprob_t <- inf_stereoprob_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_stereoprob_d <- inf_stereoprob_dat %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_stereoprob_report <- paste0("t(", inf_stereoprob_t$df, ") = ",
inf_stereoprob_t$statistic, ", ", inf_stereoprob_t$p, ", d = ",
inf_stereoprob_d)
# EXCLUDE OUTLIERS
inf_stereoprob_dat_ex <- inf_stereoprob_dat %>%
filter(inf >= 40)
# Descriptives
inf_stereoprob_ex_desc <- inf_stereoprob_dat_ex %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_stereoprob_ex_t <- inf_stereoprob_dat_ex %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_stereoprob_ex_d <- inf_stereoprob_dat_ex %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_stereoprob_ex_report <- paste0("t(", inf_stereoprob_ex_t$df, ") = ",
inf_stereoprob_ex_t$statistic, ", ", inf_stereoprob_ex_t$p, ", d = ",
inf_stereoprob_ex_d)
# WILCOXON
# One-sided Wilcoxon rank-sum test
inf_stereoprob_wcx <- inf_stereoprob_dat %>%
wilcox_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS")
# Wilcoxon effect size r
inf_stereoprob_wcx_r <- inf_stereoprob_dat %>%
wilcox_effsize(inf ~ typicality,
ref.group = "DIS") %>%
pull(effsize)
# Print wilcoxon test
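# wilcox_effsize() defines r = |Z| / sqrt(N), with N the total sample size
# for independent groups, so |Z| is recovered as r * sqrt(N)
inf_stereoprob_wcx_report <- paste0("Z = ",
forma(inf_stereoprob_wcx_r * sqrt(nrow(inf_stereoprob_dat))),
", ", formp(inf_stereoprob_wcx$p, TRUE), ", r = ", forma(inf_stereoprob_wcx_r))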
To test the effect of manipulating the stereotype probability component of diagnosticity on inferences of partisanship, we conducted an exploratory independent samples t-test comparing target inferences of partisanship in the stereotype-confirming and stereotype-disconfirming conditions. The target inferences in the disconfirming condition (M = 80.2, SD = 22.2) were not significantly lower than in the stereotype-confirming condition (M = 80.4, SD = 24.5), t(176.35) = -0.064, p = .475, d = -0.010. To investigate whether this was due to statistical outliers, we conducted another independent samples t-test excluding participants with values below 40. After exclusion, the inferences in the disconfirming condition (M = 84.7, SD = 14.8) were still not significantly lower than in the confirming condition (M = 87.4, SD = 13.0), t(161.53) = -1.23, p = .110, d = -0.19.
inf_stereoprob_dat %>%
ggplot(., aes(typicality, inf)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(width = 0.2) +
labs(x = "Typicality", y = "Inferences") +
theme_cs_talk()
# Loop over all fixed and distribution-based outlier cutoffs
mv_stereoprob <- data.frame(
cutoff = c("none", "fixed 20", "fixed 30", "fixed 40",
"Mdn ± 2.5 IQR", "Mdn ± 2.0 IQR", "Mdn ± 1.5 IQR", "Mdn ± 1.0 IQR"),
cutoff_type = c("fixed", "fixed", "fixed", "fixed",
"Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR"),
cutoff_value = c(0, 20, 30, 40, 2.5, 2.0, 1.5, 1.0))
for (c in seq_len(nrow(mv_stereoprob))) {
if (mv_stereoprob$cutoff_type[c] == "fixed") {
mv_dat <- inf_stereoprob_dat %>%
filter(inf >= mv_stereoprob$cutoff_value[c])
} else if (mv_stereoprob$cutoff_type[c] == "Mdn ± IQR") {
mv_dat <- inf_stereoprob_dat %>%
group_by(typicality) %>%
filter(!is_outlier(inf, coef = mv_stereoprob$cutoff_value[c])) %>%
ungroup()
}
# Independent samples t-test
mv_t <- mv_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = FALSE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
mv_d <- mv_dat %>%
cohens_d(inf ~ typicality) %>%
pull(effsize) %>%
forma()
# Save test statistics for h1
mv_stereoprob$`n excluded`[c] <- nrow(inf_stereoprob_dat) - nrow(mv_dat)
mv_stereoprob$df[c] <- mv_t$df
mv_stereoprob$t[c] <- mv_t$statistic
mv_stereoprob$p[c] <- mv_t$p
mv_stereoprob$d[c] <- mv_d
}
# Print multiverse table
knitr::kable(select(mv_stereoprob, -cutoff_type, - cutoff_value),
format = "markdown")
| cutoff | n excluded | df | t | p | d |
|---|---|---|---|---|---|
| none | 0 | 176.35 | -0.064 | .475 | -0.010 |
| fixed 20 | 8 | 169.99 | -0.70 | .243 | -0.11 |
| fixed 30 | 13 | 164.77 | -0.60 | .275 | -0.093 |
| fixed 40 | 15 | 161.53 | -1.23 | .110 | -0.19 |
| Mdn ± 2.5 IQR | 3 | 174.34 | -0.34 | .366 | -0.052 |
| Mdn ± 2.0 IQR | 9 | 168.50 | -1.01 | .158 | -0.15 |
| Mdn ± 1.5 IQR | 14 | 163.79 | -0.93 | .177 | -0.14 |
| Mdn ± 1.0 IQR | 16 | 161.45 | -1.01 | .157 | -0.16 |
To investigate the robustness of the effect across various outlier exclusion criteria, we conducted a multiverse analysis. We again performed independent samples t-tests across several fixed and distribution-based thresholds. The effect was non-significant across all seven thresholds. These results provide no evidence that manipulating the stereotype probability affects inferences of partisanship.
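A non-significant test does not by itself show how small the effect must be. One way to gauge this is a confidence interval for the mean difference, which can be approximated from the group descriptives without outlier exclusion. The sketch below is an additional illustration, not part of the reported analyses; it assumes a Welch-type standard error and a two-sided 95% interval, whereas the reported test was one-sided.

```r
# Descriptives of the inference measure without outlier exclusion
m_dis <- 80.222; s_dis <- 22.219; n_dis <- 90
m_con <- 80.444; s_con <- 24.485; n_con <- 90

v_dis <- s_dis^2 / n_dis
v_con <- s_con^2 / n_con
se    <- sqrt(v_dis + v_con)

# Welch-Satterthwaite degrees of freedom
df <- (v_dis + v_con)^2 /
  (v_dis^2 / (n_dis - 1) + v_con^2 / (n_con - 1))

diff_means <- m_dis - m_con
t_val      <- diff_means / se

# Two-sided 95% confidence interval for the mean difference
ci <- diff_means + c(-1, 1) * qt(0.975, df) * se

round(c(t = t_val, df = df, lower = ci[1], upper = ci[2]), 2)
```

Under these assumptions the interval spans several scale points in both directions, so the data without exclusion are compatible with no effect as well as with a small effect in either direction.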
inf_stereoprob_dat %>%
ggplot(., aes(sample = inf)) +
labs(x = "Theoretical quantiles", y = "Data quantiles") +
stat_qq(color = "#000000") +
stat_qq_line(color = "#000000") +
facet_grid(~ typicality, labeller = "label_value") +
theme_cs_talk()
knitr::kable(inf_stereoprob_desc, format = "markdown")
| typicality | variable | n | mean | sd | ci95_low | ci95_upp |
|---|---|---|---|---|---|---|
| DIS | inf | 90 | 80.222 | 22.219 | 75.63151 | 84.81249 |
| CON | inf | 90 | 80.444 | 24.485 | 75.38535 | 85.50265 |
inf_stereoprob_dat %>%
mutate(inf_bin = cut(inf,
breaks = 9, labels = FALSE, include.lowest = TRUE)) %>%
ggplot() +
geom_bar(position = position_dodge(preserve = "single"),
aes(x = inf_bin, y = after_stat(prop), fill = typicality),
width = 0.8) +
labs(title = "Histogram of Inference Scores",
x = "Inference Scores",
y = "Proportion", fill = "Typicality") +
scale_fill_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk() +
theme(axis.text.x = element_blank())
# PREREGISTERED
# Inference data
inf_ideo_dat <- dat %>%
filter(diagnosticity_component == "COUNTERPROB") %>%
filter(category_type == "IDEOLOGICAL") %>%
filter(item_type == "target")
# Descriptives
inf_ideo_desc <- inf_ideo_dat %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_ideo_t <- inf_ideo_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_ideo_d <- inf_ideo_dat %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_ideo_report <- paste0("t(", inf_ideo_t$df, ") = ",
inf_ideo_t$statistic, ", ", inf_ideo_t$p, ", d = ",
inf_ideo_d)
# EXCLUDE OUTLIERS
inf_ideo_dat_ex <- inf_ideo_dat %>%
filter(inf >= 40)
# Descriptives
inf_ideo_ex_desc <- inf_ideo_dat_ex %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_ideo_ex_t <- inf_ideo_dat_ex %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_ideo_ex_d <- inf_ideo_dat_ex %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_ideo_ex_report <- paste0("t(", inf_ideo_ex_t$df, ") = ",
inf_ideo_ex_t$statistic, ", ", inf_ideo_ex_t$p, ", d = ",
inf_ideo_ex_d)
# WILCOXON
# One-sided Wilcoxon rank-sum test
inf_ideo_wcx <- inf_ideo_dat %>%
wilcox_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS")
# Wilcoxon effect size r
inf_ideo_wcx_r <- inf_ideo_dat %>%
wilcox_effsize(inf ~ typicality,
ref.group = "DIS") %>%
pull(effsize)
# Print wilcoxon test
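# wilcox_effsize() defines r = |Z| / sqrt(N), with N the total sample size
# for independent groups, so |Z| is recovered as r * sqrt(N)
inf_ideo_wcx_report <- paste0("Z = ",
forma(inf_ideo_wcx_r * sqrt(nrow(inf_ideo_dat))),
", ", formp(inf_ideo_wcx$p, TRUE), ", r = ", forma(inf_ideo_wcx_r))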
To test our hypothesis in the ideological category condition, we conducted a preregistered independent samples t-test comparing target inferences of ideological categories in the stereotype-confirming and stereotype-disconfirming conditions. The target inferences in the disconfirming condition (M = 79.0, SD = 17.4) were not significantly lower than in the stereotype-confirming condition (M = 81.7, SD = 20.4), t(173.75) = -0.96, p = .168, d = -0.14. To investigate whether this was due to statistical outliers, we conducted another independent samples t-test excluding participants with values below 40. After exclusion, the inferences in the disconfirming condition (M = 80.1, SD = 16.0) were significantly lower than in the confirming condition (M = 86.0, SD = 12.9), t(165.59) = -2.66, p = .004, d = -0.41.
inf_ideo_dat %>%
ggplot(., aes(typicality, inf)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(width = 0.2) +
labs(x = "Typicality", y = "Inferences") +
theme_cs_talk()
# Loop over all fixed and distribution-based outlier cutoffs
mv_ideo <- data.frame(
cutoff = c("none", "fixed 20", "fixed 30", "fixed 40",
"Mdn ± 2.5 IQR", "Mdn ± 2.0 IQR", "Mdn ± 1.5 IQR", "Mdn ± 1.0 IQR"),
cutoff_type = c("fixed", "fixed", "fixed", "fixed",
"Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR"),
cutoff_value = c(0, 20, 30, 40, 2.5, 2.0, 1.5, 1.0))
for (c in seq_len(nrow(mv_ideo))) {
if (mv_ideo$cutoff_type[c] == "fixed") {
mv_dat <- inf_ideo_dat %>%
filter(inf >= mv_ideo$cutoff_value[c])
} else if (mv_ideo$cutoff_type[c] == "Mdn ± IQR") {
mv_dat <- inf_ideo_dat %>%
group_by(typicality) %>%
filter(!is_outlier(inf, coef = mv_ideo$cutoff_value[c])) %>%
ungroup()
}
# Independent samples t-test
mv_t <- mv_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = FALSE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
mv_d <- mv_dat %>%
cohens_d(inf ~ typicality) %>%
pull(effsize) %>%
forma()
# Save test statistics for h1
mv_ideo$`n excluded`[c] <- nrow(inf_ideo_dat) - nrow(mv_dat)
mv_ideo$df[c] <- mv_t$df
mv_ideo$t[c] <- mv_t$statistic
mv_ideo$p[c] <- mv_t$p
mv_ideo$d[c] <- mv_d
}
# Print multiverse table
knitr::kable(select(mv_ideo, -cutoff_type, - cutoff_value), format = "markdown")
| cutoff | n excluded | df | t | p | d |
|---|---|---|---|---|---|
| none | 0 | 173.75 | -0.96 | .168 | -0.14 |
| fixed 20 | 3 | 174.91 | -1.97 | .025 | -0.30 |
| fixed 30 | 5 | 172.58 | -2.13 | .017 | -0.32 |
| fixed 40 | 8 | 165.59 | -2.66 | .004 | -0.41 |
| Mdn ± 2.5 IQR | 3 | 174.91 | -1.97 | .025 | -0.30 |
| Mdn ± 2.0 IQR | 5 | 168.98 | -2.66 | .004 | -0.40 |
| Mdn ± 1.5 IQR | 8 | 160.39 | -3.16 | <.001 | -0.48 |
| Mdn ± 1.0 IQR | 9 | 160.81 | -3.00 | .002 | -0.46 |
To investigate the robustness of the effect across various outlier exclusion criteria, we conducted a multiverse analysis. We again performed independent-samples t-tests across several fixed and distribution-based thresholds. The effect was significant for all seven thresholds (all p ≤ .025), but not without exclusion (p = .168). These results again suggest that the effect may have been obscured by outliers.
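The logic of such a multiverse analysis can be sketched in base R as below. The data are simulated and the names `sim`, `specs`, `outside_fences()`, and `run_spec()` are hypothetical. The distribution-based rule in this sketch flags values outside the quartile fences Q1 − coef × IQR and Q3 + coef × IQR within each condition; that this corresponds to the rule labelled "Mdn ± IQR" in the table is an assumption.

```r
# Hypothetical data with the same structure as the inference data
set.seed(2)
sim <- data.frame(
  typicality = factor(rep(c("DIS", "CON"), each = 90), levels = c("DIS", "CON")),
  inf = pmin(100, pmax(0, c(rnorm(90, 79, 17), rnorm(90, 82, 20))))
)

# TRUE for values outside [Q1 - coef * IQR, Q3 + coef * IQR]
outside_fences <- function(x, coef) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]
  x < q[1] - coef * iqr | x > q[2] + coef * iqr
}

# One row per specification: no exclusion, fixed cutoffs, fence coefficients
specs <- data.frame(
  type = c("none", rep("fixed", 3), rep("fence", 4)),
  value = c(NA, 20, 30, 40, 2.5, 2.0, 1.5, 1.0)
)

run_spec <- function(type, value, data) {
  keep <- switch(type,
    none = rep(TRUE, nrow(data)),
    fixed = data$inf >= value,
    # ave() applies the fence rule separately within each condition
    fence = !as.logical(ave(data$inf, data$typicality,
                            FUN = function(x) outside_fences(x, value)))
  )
  test <- t.test(inf ~ typicality, data = data[keep, ], alternative = "less")
  data.frame(
    type = type, value = value, n_excluded = sum(!keep),
    df = unname(test$parameter), t = unname(test$statistic), p = test$p.value
  )
}

multiverse <- do.call(rbind, lapply(seq_len(nrow(specs)), function(i) {
  run_spec(specs$type[i], specs$value[i], sim)
}))
```

Keeping the specifications in a data frame makes it easy to add or remove thresholds without touching the test code.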
inf_ideo_dat_ex %>%
ggplot(., aes(sample = inf)) +
labs(x = "Theoretical quantiles", y = "Data quantiles") +
stat_qq(color = "#000000") +
stat_qq_line(color = "#000000") +
facet_grid(~ typicality, labeller = "label_value") +
theme_cs_talk()
knitr::kable(inf_ideo_desc, format = "markdown")
| typicality | variable | n | mean | sd | ci95_low | ci95_upp |
|---|---|---|---|---|---|---|
| DIS | inf | 90 | 79.022 | 17.417 | 75.42361 | 82.62039 |
| CON | inf | 90 | 81.744 | 20.391 | 77.53118 | 85.95682 |
inf_ideo_dat_ex %>%
mutate(inf_bin = cut(inf,
breaks = 9, labels = FALSE, include.lowest = TRUE)) %>%
ggplot() +
geom_bar(position = position_dodge(preserve = "single"),
aes(x = inf_bin, y = after_stat(prop), fill = typicality),
width = 0.8) +
labs(title = "Histogram of Inference Scores",
x = "Inference Scores",
y = "Proportion", fill = "Typicality") +
scale_fill_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk() +
theme(axis.text.x = element_blank())
# Prepare data
inf_ideo_ideo_dat <- inf_ideo_dat_ex %>%
mutate(ideo_cat = case_when(
political_ideology < 35 ~ "CON",
political_ideology > 65 ~ "LIB",
.default = "MOD"
))
# Descriptives
inf_ideo_ideo_desc <- inf_ideo_ideo_dat %>%
group_by(ideo_cat, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
inf_ideo_ideo_mod <- inf_ideo_ideo_dat %>%
anova_test(dv = inf,
effect.size = "pes",
between = c(ideo_cat, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
knitr::kable(inf_ideo_ideo_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| ideo_cat | 2 | 166 | 7.03 | .001 | * | .078 |
| typicality | 1 | 166 | 6.38 | .012 | * | .037 |
| ideo_cat:typicality | 2 | 166 | 0.97 | .381 | | .012 |
It is possible that the effect of diagnosticity on the inferences depends on the participant’s ideological category. To test this, we categorized participants with an ideology score below 35 as conservatives, those with a score above 65 as liberals, and those in between as moderates. We then conducted a two-way between-subjects ANOVA with the target inference as the dependent variable and the participants’ ideological category and typicality as between-subjects factors. We found a significant effect of ideological category, which was likely driven by lower inference scores of ideological moderates. Moreover, we found a significant effect of typicality. The interaction was not significant.
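The partial eta squared values in the table can be recovered from the F statistics and their degrees of freedom, which gives a quick consistency check. The sketch below assumes a purely between-subjects design, where partial eta squared equals SS_effect / (SS_effect + SS_error); the helper `pes_from_f()` is hypothetical and not part of the analysis code.

```r
# Partial eta squared from an F statistic and its degrees of freedom.
# Follows from F = (SS_effect / df_num) / (SS_error / df_den).
pes_from_f <- function(f, df_num, df_den) {
  (f * df_num) / (f * df_num + df_den)
}

# F values and degrees of freedom as reported in the table above
pes_check <- round(pes_from_f(c(7.03, 6.38, 0.97), c(2, 1, 2), 166), 3)
pes_check
# 0.078 0.037 0.012
```

The recomputed values agree with the `pes` column for all three effects.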
# Prepare data
inf_ideo_fit_dat <- inf_ideo_ideo_dat %>%
# Select participants who identify as liberal or conservative
filter(ideo_cat %in% c("CON", "LIB")) %>%
# Create ideological fit variable
mutate(ideo_fit = as.factor(ifelse(
ideo_cat == target_category_label,
"same", "different")))
# Descriptives
inf_ideo_fit_desc <- inf_ideo_fit_dat %>%
group_by(ideo_fit, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
inf_ideo_fit_mod <- inf_ideo_fit_dat %>%
anova_test(dv = inf,
effect.size = "pes",
between = c(ideo_fit, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
knitr::kable(inf_ideo_fit_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| ideo_fit | 1 | 111 | 0.66 | .420 | | .006 |
| typicality | 1 | 111 | 2.20 | .140 | | .019 |
| ideo_fit:typicality | 1 | 111 | 0.78 | .378 | | .007 |
It may be possible that the effect of diagnosticity on the inferences depends on the fit between the participant’s ideological category and the target’s category. We therefore ran an exploratory analysis examining the role of ideological fit (i.e., whether participant and target belong to the same category or a different one). We excluded participants classified as moderates and conducted a two-way between-subjects ANOVA with the target inference as the dependent variable and ideological fit and typicality as between-subjects factors. None of the effects were significant.
# Descriptives
inf_ideo_issue_desc <- inf_ideo_dat_ex %>%
group_by(issue, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
inf_ideo_issue_mod <- inf_ideo_dat_ex %>%
anova_test(dv = inf,
effect.size = "pes",
between = c(issue, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
knitr::kable(inf_ideo_issue_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| issue | 2 | 166 | 6.42 | .002 | * | .072 |
| typicality | 1 | 166 | 7.59 | .007 | * | .044 |
| issue:typicality | 2 | 166 | 0.79 | .455 | | .009 |
We ran another exploratory analysis to test whether the effect of diagnosticity on the inferences was moderated by the issue for which diagnosticity was manipulated. We conducted a two-way between-subjects ANOVA with the target inference as the dependent variable and issue and typicality as between-subjects factors. We found a significant effect for issue that was likely driven by lower inferences in the affirmative action condition. Moreover, we found a significant effect of typicality. The interaction was not significant.
inf_ideo_typ_dat <- inf_ideo_dat_ex %>%
group_by(typicality) %>%
mutate(typicality_bin = ntile(typicality_target, 2)) %>%
mutate(typicality_bin = factor(typicality_bin, labels = c("low", "high"))) %>%
ungroup() %>%
mutate(typicality_target_c = typicality_target - mean(typicality_target))
# DV: Typicality
# Descriptives
typ_ideo_desc <- inf_ideo_typ_dat %>%
group_by(typicality) %>%
get_summary_stats(typicality_target, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
typ_ideo_t <- inf_ideo_typ_dat %>%
t_test(
typicality_target ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
typ_ideo_d <- inf_ideo_typ_dat %>%
cohens_d(typicality_target ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
typ_ideo_report <- paste0("t(", typ_ideo_t$df, ") = ",
typ_ideo_t$statistic, ", ", typ_ideo_t$p, ", d = ",
typ_ideo_d)
# DV: Inferences
inf_ideo_typ_desc <- inf_ideo_typ_dat %>%
group_by(typicality_bin, typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
knitr::kable(inf_ideo_typ_desc, format = "markdown")
| typicality | typicality_bin | variable | n | mean | sd | ci95_low | ci95_upp |
|---|---|---|---|---|---|---|---|
| DIS | low | inf | 44 | 79.909 | 15.138 | 75.43601 | 84.38199 |
| CON | low | inf | 42 | 84.071 | 11.604 | 80.56155 | 87.58045 |
| DIS | high | inf | 44 | 80.318 | 16.930 | 75.31550 | 85.32050 |
| CON | high | inf | 42 | 87.929 | 13.978 | 83.70157 | 92.15643 |
# Run regression
contrasts(inf_ideo_typ_dat$typicality) <- c(1, 0)
inf_ideo_typ_mod <- lm(
data = inf_ideo_typ_dat,
inf ~ typicality * typicality_target)
inf_ideo_typ_mod_coefs <- summary(inf_ideo_typ_mod)$coefficients %>%
as_tibble() %>%
rowwise() %>%
transmute(
predictor = "var", estimate = forma(`Estimate`, 2),
SE = forma(`Std. Error`, 2), t = forma(`t value`, 2),
p = formp(`Pr(>|t|)`)) %>%
ungroup() %>%
mutate(predictor = c("intercept", "typicality CON -> DIS",
"typicality rating LOW -> HIGH", "interaction"))
# Plot
inf_ideo_typ_dat %>%
ggplot(aes(x = typicality_target, y = inf, color = typicality)) +
geom_point(alpha = 0.6) +
geom_smooth(method = "lm", se = TRUE,
aes(group = typicality)) +
labs(
title = "Moderation by typicality rating",
x = "Typicality rating",
y = "Inference",
color = "Typicality") +
scale_color_manual(values = c("#2d85ff", "#28ea96")) +
theme_cs_talk()
The target person in the disconfirming condition was rated as significantly less typical than in the confirming condition, t(164.67) = -6.91, p < .001, d = -1.05. To explore the effect of the typicality ratings, we ran a linear regression model with the inferences as the dependent variable and typicality, the typicality rating, and their interaction as predictors. We used treatment coding for typicality and centered the typicality ratings. The overall model was not significant, F(3, 168) = 2.47, p = .064, R² = .042. The main effect of typicality was not significant, b = -0.52, SE = 8.44, t(168) = -0.06, p = .951. The main effect of the typicality rating was not significant, b = 0.05, SE = 0.09, t(168) = 0.56, p = .578. Lastly, there was no significant interaction, b = -0.07, SE = 0.11, t(168) = -0.66, p = .511.
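In a model with an interaction, centering the continuous predictor changes what the treatment-coded condition coefficient means: with raw ratings it is the condition difference at a rating of 0, with centered ratings it is the difference at the mean rating. The interaction coefficient is unaffected. The following sketch illustrates this on simulated data; `sim`, `rating`, and `rating_c` are hypothetical names, and R's default treatment coding is an assumption of the sketch.

```r
# Hypothetical data: condition, a 0-100 typicality rating, and an inference score
set.seed(3)
n <- 172
sim <- data.frame(
  typicality = factor(rep(c("CON", "DIS"), each = n / 2), levels = c("CON", "DIS")),
  rating = round(runif(n, 0, 100))
)
sim$inf <- 80 - 5 * (sim$typicality == "DIS") + 0.05 * sim$rating + rnorm(n, 0, 15)

# Grand-mean centering of the rating
sim$rating_c <- sim$rating - mean(sim$rating)

fit_raw <- lm(inf ~ typicality * rating, data = sim)
fit_centered <- lm(inf ~ typicality * rating_c, data = sim)

# The interaction slope is identical in both parameterizations
b_int_raw <- unname(coef(fit_raw)["typicalityDIS:rating"])
b_int_centered <- unname(coef(fit_centered)["typicalityDIS:rating_c"])

# The condition coefficient shifts by (interaction slope * mean rating)
b_cond_raw <- unname(coef(fit_raw)["typicalityDIS"])
b_cond_centered <- unname(coef(fit_centered)["typicalityDIS"])
shift <- b_int_raw * mean(sim$rating)
```

Here `b_cond_centered` equals `b_cond_raw + shift`, so the two models fit the data equally well and differ only in the point at which the condition effect is evaluated.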
# Inference data
inf_stereoprob_ideo_dat <- dat %>%
filter(diagnosticity_component == "STEREOPROB") %>%
filter(category_type == "IDEOLOGICAL") %>%
filter(item_type == "target")
# Descriptives
inf_stereoprob_ideo_desc <- inf_stereoprob_ideo_dat %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_stereoprob_ideo_t <- inf_stereoprob_ideo_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_stereoprob_ideo_d <- inf_stereoprob_ideo_dat %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_stereoprob_ideo_report <- paste0("t(", inf_stereoprob_ideo_t$df, ") = ",
inf_stereoprob_ideo_t$statistic, ", ", inf_stereoprob_ideo_t$p, ", d = ",
inf_stereoprob_ideo_d)
# EXCLUDE OUTLIERS
inf_stereoprob_ideo_dat_ex <- inf_stereoprob_ideo_dat %>%
filter(inf >= 40)
# Descriptives
inf_stereoprob_ideo_ex_desc <- inf_stereoprob_ideo_dat_ex %>%
group_by(typicality) %>%
get_summary_stats(inf, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Independent samples t-test
inf_stereoprob_ideo_ex_t <- inf_stereoprob_ideo_dat_ex %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = TRUE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
inf_stereoprob_ideo_ex_d <- inf_stereoprob_ideo_dat_ex %>%
cohens_d(inf ~ typicality, paired = FALSE) %>%
pull(effsize) %>%
forma()
# Print t-test
inf_stereoprob_ideo_ex_report <- paste0("t(", inf_stereoprob_ideo_ex_t$df, ") = ",
inf_stereoprob_ideo_ex_t$statistic, ", ", inf_stereoprob_ideo_ex_t$p, ", d = ",
inf_stereoprob_ideo_ex_d)
# WILCOXON
# One-sided Wilcoxon rank-sum test
inf_stereoprob_ideo_wcx <- inf_stereoprob_ideo_dat %>%
wilcox_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS")
# Wilcoxon effect size r
inf_stereoprob_ideo_wcx_r <- inf_stereoprob_ideo_dat %>%
wilcox_effsize(inf ~ typicality,
ref.group = "DIS") %>%
pull(effsize)
# Print wilcoxon test
inf_stereoprob_ideo_wcx_report <- paste0("Z = ",
forma(inf_stereoprob_ideo_wcx_r * sqrt(nrow(inf_stereoprob_ideo_dat) / 2)),
", ", formp(inf_stereoprob_ideo_wcx$p, TRUE),
", r = ", forma(inf_stereoprob_ideo_wcx_r))
To test the effect of manipulating the stereotype probability component of diagnosticity on inferences of partisanship, we conducted an exploratory independent samples t-test comparing target inferences of ideological categories in the stereotype-confirming and stereotype-disconfirming conditions. The target inferences in the disconfirming condition (M = 76.8, SD = 20.9) were not significantly lower than in the stereotype-confirming condition (M = 81.2, SD = 19.4), t(177.09) = -1.46, p = .074, d = -0.22. To investigate whether this was due to statistical outliers, we conducted another independent samples t-test excluding participants with values below 40. After exclusion, the inferences in the disconfirming condition (M = 80.5, SD = 14.6) were still not significantly lower than in the confirming condition (M = 84.0, SD = 14.6), t(168.98) = -1.57, p = .059, d = -0.24.
inf_stereoprob_ideo_dat %>%
ggplot(., aes(typicality, inf)) +
geom_boxplot(outlier.shape = NA) +
geom_jitter(width = 0.2) +
labs(x = "Typicality", y = "Inferences") +
theme_cs_talk()
# Loop over all outlier exclusion cutoffs
mv_stereoprob_ideo <- data.frame(
cutoff = c("none", "fixed 20", "fixed 30", "fixed 40",
"Mdn ± 2.5 IQR", "Mdn ± 2.0 IQR", "Mdn ± 1.5 IQR", "Mdn ± 1.0 IQR"),
cutoff_type = c("fixed", "fixed", "fixed", "fixed",
"Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR", "Mdn ± IQR"),
cutoff_value = c(0, 20, 30, 40, 2.5, 2.0, 1.5, 1.0))
for (c in seq_len(nrow(mv_stereoprob_ideo))) {
if (mv_stereoprob_ideo$cutoff_type[c] == "fixed") {
mv_dat <- inf_stereoprob_ideo_dat %>%
filter(inf >= mv_stereoprob_ideo$cutoff_value[c])
} else if (mv_stereoprob_ideo$cutoff_type[c] == "Mdn ± IQR") {
mv_dat <- inf_stereoprob_ideo_dat %>%
group_by(typicality) %>%
filter(!is_outlier(inf, coef = mv_stereoprob_ideo$cutoff_value[c])) %>%
ungroup()
}
# Independent samples t-test
mv_t <- mv_dat %>%
t_test(
inf ~ typicality,
alternative = "less",
ref.group = "DIS") %>%
mutate(p = formp(p, text = FALSE), df = forma(df, 2),
statistic = forma(statistic))
# Cohens d
mv_d <- mv_dat %>%
cohens_d(inf ~ typicality) %>%
pull(effsize) %>%
forma()
# Save test statistics for h1
mv_stereoprob_ideo$`n excluded`[c] <-
nrow(inf_stereoprob_ideo_dat) - nrow(mv_dat)
mv_stereoprob_ideo$df[c] <- mv_t$df
mv_stereoprob_ideo$t[c] <- mv_t$statistic
mv_stereoprob_ideo$p[c] <- mv_t$p
mv_stereoprob_ideo$d[c] <- mv_d
}
# Print multiverse table
knitr::kable(select(mv_stereoprob_ideo, -cutoff_type, - cutoff_value),
format = "markdown")
| cutoff | n excluded | df | t | p | d |
|---|---|---|---|---|---|
| none | 0 | 177.09 | -1.46 | .074 | -0.22 |
| fixed 20 | 6 | 171.66 | -1.16 | .123 | -0.18 |
| fixed 30 | 9 | 168.98 | -1.57 | .059 | -0.24 |
| fixed 40 | 9 | 168.98 | -1.57 | .059 | -0.24 |
| Mdn ± 2.5 IQR | 5 | 171.02 | -0.83 | .204 | -0.13 |
| Mdn ± 2.0 IQR | 7 | 169.22 | -0.93 | .176 | -0.14 |
| Mdn ± 1.5 IQR | 9 | 168.98 | -1.57 | .059 | -0.24 |
| Mdn ± 1.0 IQR | 19 | 157.70 | -0.67 | .253 | -0.10 |
To investigate the robustness of the effect across various outlier exclusion criteria, we conducted a multiverse analysis. We again performed independent-samples t-tests across several fixed and distribution-based thresholds. The effect was not significant for any of the seven thresholds (all p ≥ .059). These results provide no evidence that manipulating the stereotype probability affects inferences of partisanship.
inf_stereoprob_ideo_dat %>%
ggplot(., aes(sample = inf)) +
labs(x = "Theoretical quantiles", y = "Data quantiles") +
stat_qq(color = "#000000") +
stat_qq_line(color = "#000000") +
facet_grid(~ typicality, labeller = "label_value") +
theme_cs_talk()
knitr::kable(inf_stereoprob_ideo_desc, format = "markdown")
| typicality | variable | n | mean | sd | ci95_low | ci95_upp |
|---|---|---|---|---|---|---|
| DIS | inf | 90 | 76.789 | 20.881 | 72.47494 | 81.10306 |
| CON | inf | 90 | 81.167 | 19.433 | 77.15210 | 85.18190 |
inf_stereoprob_ideo_dat %>%
mutate(inf_bin = cut(inf,
breaks = 9, labels = FALSE, include.lowest = TRUE)) %>%
ggplot() +
geom_bar(position = position_dodge(preserve = "single"),
aes(x = inf_bin, y = after_stat(prop), fill = typicality),
width = 0.8) +
labs(title = "Histogram of Inference Scores",
x = "Inference Scores",
y = "Proportion", fill = "Typicality") +
scale_fill_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk() +
theme(axis.text.x = element_blank())
diag <- data.frame(
probsA = rep(seq(0.1, 0.9, 0.02), each = 41),
probsB = rep(seq(0.1, 0.9, 0.02), times = 41),
base = NA,
incr_counter = NA,
decr_stereo = NA
) %>%
filter(probsA >= probsB)
incr <- .063
for (i in seq_len(nrow(diag))) {
probA <- diag$probsA[i]
probB <- diag$probsB[i]
diag$base[i] <- probA / (probA + probB)
diag$incr_counter[i] <- probA / (probA + (probB + incr))
diag$decr_stereo[i] <- (probA - incr) / ((probA - incr) + probB)
}
diag_p <- diag %>%
mutate(
incr_counter_effect = base - incr_counter,
decr_stereo_effect = base - decr_stereo)
library(plot3D)
par(mfrow = c(2, 1))
scatter3D(x = diag_p$probsA, y = diag_p$probsB, z = diag_p$decr_stereo_effect,
pch = 16,
xlab = "Stereotype Probability",
ylab = "Counterstereotype Probability",
zlab = "Change in Diagnosticity",
colvar = diag_p$base,
col = ramp.col(col = c("#2d85ff", "#28ea96"), n = 100, alpha = 1),
main = "Change in diagnosticity due to decreasing stereotype probability")
scatter3D(x = diag_p$probsB, y = diag_p$probsA, z = diag_p$incr_counter_effect,
pch = 16,
xlab = "Counterstereotype Probability",
ylab = "Stereotype Probability",
zlab = "Change in Diagnosticity",
colvar = diag_p$base,
col = ramp.col(col = c("#2d85ff", "#28ea96"), n = 100, alpha = 1),
main = "Change in diagnosticity due to increasing counterstereotype probability")
diag_dat <- dat %>%
filter(item_type == "target") %>%
filter(inf >= 40)
# Export data for simulation
# saveRDS(diag_dat, "Preregistration/sim_dat.RDS")
Decreasing the stereotype probability leads to larger decreases in diagnosticity the smaller the stereotype probability is and the closer together the two probabilities are. Generally, these are cases with low diagnosticity. Increasing the counterstereotype probability, on the other hand, leads to larger decreases in diagnosticity the smaller the counterstereotype probability is and the further apart the two probabilities are. Generally, these are cases with high diagnosticity. As the opinions presented in the current study are highly diagnostic, the diagnosticity formula predicts that decreasing the stereotype probability produces a smaller decrease in diagnosticity than increasing the counterstereotype probability.
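This pattern can be derived from the diagnosticity formula used in the simulation code, where diagnosticity is the stereotype probability divided by the sum of both probabilities. The notation is introduced here for illustration only: $p_S$ is the stereotype probability (`probsA`), $p_C$ the counterstereotype probability (`probsB`), and $\delta$ the size of the shift (`incr`). The decrease in diagnosticity from lowering the stereotype probability is

$$
\Delta_S = \frac{p_S}{p_S + p_C} - \frac{p_S - \delta}{p_S - \delta + p_C} = \frac{\delta \, p_C}{(p_S + p_C)(p_S + p_C - \delta)},
$$

and the decrease from raising the counterstereotype probability is

$$
\Delta_C = \frac{p_S}{p_S + p_C} - \frac{p_S}{p_S + p_C + \delta} = \frac{\delta \, p_S}{(p_S + p_C)(p_S + p_C + \delta)}.
$$

The numerator of $\Delta_S$ scales with $p_C$ and the numerator of $\Delta_C$ with $p_S$, so for highly diagnostic opinions ($p_S$ much larger than $p_C$) the second change is the larger one. As a hypothetical example with $p_S = .90$, $p_C = .10$, and $\delta = .063$, we get $\Delta_S = .0063 / .937 \approx .007$ and $\Delta_C = .0567 / 1.063 \approx .053$.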
# Stereotype data
like_dat <- dat %>%
select(subject_id, target_category_label, target_category,
partisan_identity, political_ideology,
category_type, typicality, issue, matches("_like_")) %>%
unique() %>%
# Exclude participants who gave same rating in all likability ratings
rowwise() %>%
filter(
!all(
t1_like_nontarget == t1_like_target,
t1_like_target == t2_like_nontarget,
t2_like_nontarget == t2_like_target)) %>%
ungroup() %>%
pivot_longer(
cols = matches("_like_"),
names_to = c("time", ".value", "person"),
names_pattern = "([a-z0-9]*)_(like)_(.*)") %>%
mutate(participant_category = case_when(.default = "OTH",
category_type == "PARTY" &
partisan_identity == "Independent" ~ "IND",
category_type == "PARTY" &
partisan_identity == "Republican" ~ "REP",
category_type == "PARTY" &
partisan_identity == "Democrat" ~ "DEM",
category_type == "IDEOLOGICAL" &
(political_ideology <= 65 & political_ideology >= 35) ~ "MOD",
category_type == "IDEOLOGICAL" &
(political_ideology > 65) ~ "LIB",
category_type == "IDEOLOGICAL" &
(political_ideology < 35) ~ "CON")) %>%
filter(!participant_category %in% c("MOD", "IND", "OTH")) %>%
mutate(category_fit = as.factor(ifelse(
participant_category == target_category_label,
"ingroup", "outgroup"))) %>%
filter(person == "target")
# Descriptives
like_desc <- like_dat %>%
group_by(category_fit, typicality, time) %>%
get_summary_stats(like, type = "mean_sd") %>%
mutate(
ci95_low = mean - 1.96 * sd / sqrt(n),
ci95_upp = mean + 1.96 * sd / sqrt(n))
# Run ANOVA
like_mod <- like_dat %>%
anova_test(dv = like,
wid = subject_id,
effect.size = "pes",
within = time,
between = c(category_fit, typicality)) %>%
as_tibble() %>%
rowwise() %>%
mutate(F = forma(`F`, 2), p = formp(p), pes = forma(pes, 3, FALSE))
like_dat %>%
ggplot(., aes(typicality, like, color = time)) +
facet_wrap(~ category_fit) +
geom_boxplot(outlier.shape = NA) +
geom_point(position = position_jitterdodge()) +
labs(x = "Typicality", y = "Stereotype") +
scale_color_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk()
like_dat %>%
ggplot(., aes(sample = like)) +
labs(x = "Theoretical quantiles", y = "Data quantiles") +
stat_qq(color = "#000000") +
stat_qq_line(color = "#000000") +
facet_grid(category_fit * typicality ~ time, labeller = "label_value") +
theme_cs_talk()
knitr::kable(like_desc, format = "markdown")
| typicality | time | category_fit | variable | n | mean | sd | ci95_low | ci95_upp |
|---|---|---|---|---|---|---|---|---|
| DIS | t1 | ingroup | like | 61 | 77.951 | 16.842 | 73.72446 | 82.17754 |
| DIS | t2 | ingroup | like | 61 | 68.148 | 25.608 | 61.72161 | 74.57439 |
| CON | t1 | ingroup | like | 67 | 78.030 | 18.292 | 73.64994 | 82.41006 |
| CON | t2 | ingroup | like | 67 | 81.015 | 18.079 | 76.68595 | 85.34405 |
| DIS | t1 | outgroup | like | 56 | 62.321 | 29.248 | 54.66048 | 69.98152 |
| DIS | t2 | outgroup | like | 56 | 69.304 | 24.596 | 62.86191 | 75.74609 |
| CON | t1 | outgroup | like | 55 | 65.418 | 23.418 | 59.22894 | 71.60706 |
| CON | t2 | outgroup | like | 55 | 49.964 | 26.728 | 42.90015 | 57.02785 |
knitr::kable(like_mod, format = "markdown")
| Effect | DFn | DFd | F | p | p<.05 | pes |
|---|---|---|---|---|---|---|
| category_fit | 1 | 235 | 31.32 | <.001 | * | .118 |
| typicality | 1 | 235 | 0.10 | .751 | | .000 |
| time | 1 | 235 | 6.76 | .010 | * | .028 |
| category_fit:typicality | 1 | 235 | 7.90 | .005 | * | .033 |
| category_fit:time | 1 | 235 | 0.08 | .779 | | .000 |
| typicality:time | 1 | 235 | 2.69 | .102 | | .011 |
| category_fit:typicality:time | 1 | 235 | 35.89 | <.001 | * | .132 |
like_dat %>%
mutate(like_bin = cut(like,
breaks = 9, labels = FALSE, include.lowest = TRUE)) %>%
ggplot() +
geom_bar(position = position_dodge(preserve = "single"),
aes(x = like_bin, y = after_stat(prop), fill = time), width = 0.8) +
labs(title = "Histogram of Stereotype Scores",
x = "Likability (binned)",
y = "Proportion", fill = "Time") +
facet_grid(category_fit ~ typicality, labeller = "label_value") +
scale_fill_manual(values = c("#849AB9", "#465263")) +
theme_cs_talk() +
theme(axis.text.x = element_blank())